Galway
Evaluating Small Vision-Language Models on Distance-Dependent Traffic Perception
Theodoridis, Nikos, Brophy, Tim, Mohandas, Reenu, Sistu, Ganesh, Collins, Fiachra, Scanlan, Anthony, Eising, Ciaran
Vision-Language Models (VLMs) are becoming increasingly powerful, demonstrating strong performance on a variety of tasks that require both visual and textual understanding. Their strong generalisation abilities make them a promising component for automated driving systems, which must handle unexpected corner cases. However, to be trusted in such safety-critical applications, a model must first possess a reliable perception system. Moreover, since critical objects and agents in traffic scenes are often at a distance, we require systems that are not "shortsighted", i.e., systems with strong perception capabilities at both close (up to 20 meters) and long (30+ meters) range. With this in mind, we introduce Distance-Annotated Traffic Perception Question Answering (DTPQA), the first Visual Question Answering (VQA) benchmark focused solely on perception-based questions in traffic scenes, enriched with distance annotations. By excluding questions that require reasoning, we ensure that model performance reflects perception capabilities alone. Since automated driving hardware has limited processing power and cannot support large VLMs, our study centers on smaller VLMs. More specifically, we evaluate several state-of-the-art (SOTA) small VLMs on DTPQA and show that, despite the simplicity of the questions, these models significantly underperform compared to humans (~60% average accuracy for the best-performing small VLM versus ~85% human performance). However, it is important to note that the human sample size was relatively small, which imposes statistical limitations. We also identify specific perception tasks, such as distinguishing left from right, that remain particularly challenging for these models.
- Europe > Ireland > Munster > County Limerick > Limerick (0.04)
- Europe > Ireland > Connaught > County Galway > Galway (0.04)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
Securing Large Language Models (LLMs) from Prompt Injection Attacks
Suri, Omar Farooq Khan, McCrae, John
Large Language Models (LLMs) are increasingly being deployed in real-world applications, but their flexibility exposes them to prompt injection attacks. These attacks leverage the model's instruction-following ability to make it perform malicious tasks. Recent work has proposed JATMO, a task-specific fine-tuning approach that trains non-instruction-tuned base models to perform a single function, thereby reducing susceptibility to adversarial instructions. In this study, we evaluate the robustness of JATMO against HOUYI, a genetic attack framework that systematically mutates and optimizes adversarial prompts. We adapt HOUYI by introducing custom fitness scoring, modified mutation logic, and a new harness for local model testing, enabling a more accurate assessment of defense effectiveness. We fine-tuned LLaMA 2-7B, Qwen1.5-4B, and Qwen1.5-0.5B models under the JATMO methodology and compared them with a fine-tuned GPT-3.5-Turbo baseline. Results show that while JATMO reduces attack success rates relative to instruction-tuned models, it does not fully prevent injections; adversaries exploiting multilingual cues or code-related disruptors still bypass defenses. We also observe a trade-off between generation quality and injection vulnerability, suggesting that better task performance often correlates with increased susceptibility. Our results highlight both the promise and limitations of fine-tuning-based defenses and point toward the need for layered, adversarially informed mitigation strategies.
Can Synthetic Data Improve Symbolic Regression Extrapolation Performance?
Ramlan, Fitria Wulandari, O'Riordan, Colm, Kronberger, Gabriel, McDermott, James
Many machine learning models perform well when making predictions within the training data range, but often struggle when required to extrapolate beyond it. Symbolic regression (SR) using genetic programming (GP) can generate flexible models but is prone to unreliable behaviour in extrapolation. This paper investigates whether adding synthetic data can help improve performance in such cases. We apply Kernel Density Estimation (KDE) to identify regions in the input space where the training data is sparse. Synthetic data is then generated in those regions using a knowledge distillation approach: a teacher model generates predictions on new input points, which are then used to train a student model. We evaluate this method across six benchmark datasets, using neural networks (NN), random forests (RF), and GP both as teacher models (to generate synthetic data) and as student models (trained on the augmented data). Results show that GP models can often improve when trained on synthetic data, especially in extrapolation areas. However, the improvement depends on the dataset and teacher model used. The most important improvements are observed when synthetic data from GPe is used to train GPp in extrapolation regions. Changes in interpolation areas show only slight changes. We also observe heterogeneous errors, where model performance varies across different regions of the input space. Overall, this approach offers a practical solution for better extrapolation. Note: An earlier version of this work appeared in the GECCO 2025 Workshop on Symbolic Regression. This arXiv version corrects several parts of the original submission.
- Europe > Ireland > Connaught > County Galway > Galway (0.04)
- Europe > Switzerland (0.04)
- Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
- Europe > Austria > Upper Austria (0.04)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
MOMA-AC: A preference-driven actor-critic framework for continuous multi-objective multi-agent reinforcement learning
Callaghan, Adam, Mason, Karl, Mannion, Patrick
This paper addresses a critical gap in Multi-Objective Multi-Agent Reinforcement Learning (MOMARL) by introducing the first dedicated inner-loop actor-critic framework for continuous state and action spaces: Multi-Objective Multi-Agent Actor-Critic (MOMA-AC). Building on single-objective, single-agent algorithms, we instantiate this framework with Twin Delayed Deep Deterministic Policy Gradient (TD3) and Deep Deterministic Policy Gradient (DDPG), yielding MOMA-TD3 and MOMA-DDPG. The framework combines a multi-headed actor network, a centralised critic, and an objective preference-conditioning architecture, enabling a single neural network to encode the Pareto front of optimal trade-off policies for all agents across conflicting objectives in a continuous MOMARL setting. We also outline a natural test suite for continuous MOMARL by combining a pre-existing multi-agent single-objective physics simulator with its multi-objective single-agent counterpart. Evaluating cooperative locomotion tasks in this suite, we show that our framework achieves statistically significant improvements in expected utility and hypervolume relative to outer-loop and independent training baselines, while demonstrating stable scalability as the number of agents increases. These results establish our framework as a foundational step towards robust, scalable multi-objective policy learning in continuous multi-agent domains.
- Europe > Ireland > Connaught > County Galway > Galway (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Middle East > Jordan (0.04)
- Energy (0.93)
- Transportation > Ground > Road (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Aerial Assistance System for Automated Firefighting during Turntable Ladder Operations
Quenzel, Jan, Sekin, Valerij, Schleich, Daniel, Miller, Alexander, Stampa, Merlin, Pahlke, Norbert, Röhrig, Christof, Behnke, Sven
Fires in industrial facilities pose special challenges to firefighters, e.g., due to the sheer size and scale of the buildings. The resulting visual obstructions impair firefighting accuracy, further compounded by inaccurate assessments of the fire's location. Such imprecision simultaneously increases the overall damage and prolongs the fire-brigades operation unnecessarily. We propose an automated assistance system for firefighting using a motorized fire monitor on a turntable ladder with aerial support from an unmanned aerial vehicle (UAV). The UAV flies autonomously within an obstacle-free flight funnel derived from geodata, detecting and localizing heat sources. An operator supervises the operation on a handheld controller and selects a fire target in reach. After the selection, the UAV automatically plans and traverses between two triangulation poses for continued fire localization. Simultaneously, our system steers the fire monitor to ensure the water jet reaches the detected heat source. In preliminary tests, our assistance system successfully localized multiple heat sources and directed a water jet towards the fires.
- Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.05)
- Europe > Switzerland > Vaud (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- (3 more...)
Dual-Mode Deep Anomaly Detection for Medical Manufacturing: Structural Similarity and Feature Distance
Diaz, Julio Zanon, Siogkas, Georgios, Corcoran, Peter
Automated visual inspection in medical-device manufacturing faces unique challenges, including extremely low defect rates, limited annotated data, hardware restrictions on production lines, and the need for validated, explainable artificial-intelligence systems. This paper presents two attention-guided autoencoder architectures that address these constraints through complementary anomaly-detection strategies. The first employs a multi-scale structural-similarity (4-MS-SSIM) index for inline inspection, enabling interpretable, real-time defect detection on constrained hardware. The second applies a Mahalanobis-distance analysis of randomly reduced latent features for efficient feature-space monitoring and lifecycle verification. Both approaches share a lightweight backbone optimised for high-resolution imagery for typical manufacturing conditions. Evaluations on the Surface Seal Image (SSI) dataset-representing sterile-barrier packaging inspection-demonstrate that the proposed methods outperform reference baselines, including MOCCA, CPCAE, and RAG-PaDiM, under realistic industrial constraints. Cross-domain validation on the MVTec-Zipper benchmark confirms comparable accuracy to state-of-the-art anomaly-detection methods. The dual-mode framework integrates inline anomaly detection and supervisory monitoring, advancing explainable AI architectures toward greater reliability, observability, and lifecycle monitoring in safety-critical manufacturing environments. To facilitate reproducibility, the source code developed for the experiments has been released in the project repository, while the datasets were obtained from publicly available sources.
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- (3 more...)
- Information Technology > Security & Privacy (1.00)
- Law (0.93)
- Government > Regional Government (0.68)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Why aren't young people having sex any more?
Sexual activity in young people is on the decline, but why? And what's more, should we be worried about what this means for society and the future of the human race? The comedy film was released in 1973 with a largely youthful cast and one too many double entendres. Half a century later, that title seems more apt than ever, at least among the younger members of society. Over the past few decades, sex appears to have been on the decline among teenagers and young adults - but it's not just happening in Britain . In the US in 2010, 12 per cent of 18 to 29-year-olds reported not having had sex in the past year, according to the General Social Survey, a long-running sociological survey.
- Asia > Japan (0.06)
- Oceania > Australia (0.05)
- North America > United States > Virginia (0.05)
- (5 more...)
When retrieval outperforms generation: Dense evidence retrieval for scalable fake news detection
Qazi, Alamgir Munir, McCrae, John P., Nasir, Jamal Abdul
The proliferation of misinformation necessitates robust yet computationally efficient fact verification systems. While current state-of-the-art approaches leverage Large Language Models (LLMs) for generating explanatory rationales, these methods face significant computational barriers and hallucination risks in real-world deployments. We present DeReC (Dense Retrieval Classification), a lightweight framework that demonstrates how general-purpose text embeddings can effectively replace autoregressive LLM-based approaches in fact verification tasks. By combining dense retrieval with specialized classification, our system achieves better accuracy while being significantly more efficient. DeReC outperforms explanation-generating LLMs in efficiency, reducing runtime by 95% on RAWFC (23 minutes 36 seconds compared to 454 minutes 12 seconds) and by 92% on LIAR-RAW (134 minutes 14 seconds compared to 1692 minutes 23 seconds), showcasing its effectiveness across varying dataset sizes. On the RAWFC dataset, DeReC achieves an F1 score of 65.58%, surpassing the state-of-the-art method L-Defense (61.20%). Our results demonstrate that carefully engineered retrieval-based systems can match or exceed LLM performance in specialized tasks while being significantly more practical for real-world deployment.
- Research Report > New Finding (1.00)
- Research Report > Promising Solution (0.68)
Climate Adaptation with Reinforcement Learning: Economic vs. Quality of Life Adaptation Pathways
Costa, Miguel, Vandervoort, Arthur, Drews, Martin, Morrissey, Karyn, Pereira, Francisco C.
Climate change will cause an increase in the frequency and severity of flood events, prompting the need for cohesive adaptation policymaking. Designing effective adaptation policies, however, depends on managing the uncertainty of long-term climate impacts. Meanwhile, such policies can feature important normative choices that are not always made explicit. We propose that Reinforcement Learning (RL) can be a useful tool to both identify adaptation pathways under uncertain conditions while it also allows for the explicit modelling (and consequent comparison) of different adaptation priorities (e.g. economic vs. wellbeing). We use an Integrated Assessment Model (IAM) to link together a rainfall and flood model, and compute the impacts of flooding in terms of quality of life (QoL), transportation, and infrastructure damage. Our results show that models prioritising QoL over economic impacts results in more adaptation spending as well as a more even distribution of spending over the study area, highlighting the extent to which such normative assumptions can alter adaptation policy. Our framework is publicly available: https://github.com/MLSM-at-DTU/maat_qol_framework.
- Europe > Denmark > Capital Region > Copenhagen (0.06)
- North America > United States > Washington (0.04)
- Europe > Switzerland > Geneva > Geneva (0.04)
- (3 more...)
- Transportation > Infrastructure & Services (0.94)
- Transportation > Ground > Road (0.69)
- Health & Medicine (0.64)
3D CT-Based Coronary Calcium Assessment: A Feature-Driven Machine Learning Framework
Abaid, Ayman, Guidone, Gianpiero, Alsubai, Sara, Alquahtani, Foziyah, Iqbal, Talha, Sharif, Ruth, Elzomor, Hesham, Bianchini, Emiliano, Almagal, Naeif, Madden, Michael G., Sharif, Faisal, Ullah, Ihsan
Coronary artery calcium (CAC) scoring plays a crucial role in the early detection and risk stratification of coronary artery disease (CAD). In this study, we focus on non-contrast coronary computed tomography angiography (CCTA) scans, which are commonly used for early calcification detection in clinical settings. To address the challenge of limited annotated data, we propose a radiomics-based pipeline that leverages pseudo-labeling to generate training labels, thereby eliminating the need for expert-defined segmentations. Additionally, we explore the use of pretrained foundation models, specifically CT-FM and RadImageNet, to extract image features, which are then used with traditional classifiers. We compare the performance of these deep learning features with that of radiomics features. Evaluation is conducted on a clinical CCTA dataset comprising 182 patients, where individuals are classified into two groups: zero versus non-zero calcium scores. We further investigate the impact of training on non-contrast datasets versus combined contrast and non-contrast datasets, with testing performed only on non-contrast scans. Results show that radiomics-based models significantly outperform CNN-derived embeddings from foundation models (achieving 84% accuracy and p<0.05), despite the unavailability of expert annotations.
- Europe > Ireland > Connaught > County Galway > Galway (0.05)
- Oceania > Australia (0.05)
- North America > Greenland (0.04)
- Asia > Middle East > Saudi Arabia (0.04)